1.  Introduction

The primary purpose of this paper is to apply maximum likelihood
estimation (MLE) to non-linear specifications of the probability of
labor force participation (LFP) of female family heads with dependent
children present, the Aid to Families with Dependent Children (AFDC)
population-at-risk. Despite the fact that the method of MLE has been
available for decades, MLE of highly non-linear specifications is not
common. Although methods for the analysis of qualitative data have
been discussed for many years, analysts continue to use inappropriate
methods to estimate inappropriate functional specifications. One
explanation is that computer software for MLE solution of highly
non-linear functions is not trivial.

A secondary purpose of this paper is to present a theoretical
justification for the use of a sigmoid shaped function when estimating
a probability like the labor force participation rate, whether or not
confronted with a dichotomous dependent variable. The sigmoid
specification closely agrees with the shape expected for a labor force
participation function and is logically consistent with a probability
interpretation, while the linear probability function is not.

A method which has frequently been used to help circumvent the
inherent inconsistency of predicting a non-linear phenomenon with a
strictly linear model is the use of categorical explanatory variables.
As a practical matter, use of dummy variables in linear regression is
often easier and less costly. Also, much of the available data in the
past has been reported categorically. Economists were forced to use
these categorical data rather than a better continuous measure. Habits
are hard to break. At the same time, goodness-of-fit does not solve
the problem that standard tests of significance are not appropriate
within the context of a dichotomous dependent variable, even given
robust tests; MLE solves that problem. Furthermore, the practical
advantage of linear regression over MLE will disappear as large scale
MLE programs become more readily available. Direct MLE of a non-linear
specification avoids the inherent inefficiencies of transformations
designed to smooth an estimated linear probability specification into
a non-linear form.

Many past applications have relied on ordinary least squares (OLS) to
obtain estimates of a linear probability model, both in the case of a
dichotomous variable [9, 11] and when using labor force participation
rates [6, 8]. Nerlove and Press [14] presented a cogent theoretical
argument for MLE of the logistic specification. They also presented a
program for MLE of the logistic as well as some empirical
applications. Gunderson [10], using a dichotomous variable, has
recently compared estimates of the probability of trainee retention
after training obtained from the OLS linear probability model, MLE
probit, the Orcutt transformation of the linear probability, the Theil
transformation of the logistic, and the Warner transformation of the
linear probability. Gunderson applied MLE only in the case of the
probit, while noting that the transformations do not eliminate the
inherent inefficiencies of OLS estimation of the linear probability
function.

Aigner [1] has recently applied MLE as an alternative to OLS and the
use of instrumental variables when estimating a labor supply function
from data similar to the CPS. Amemiya and Boskin [3] have applied MLE
in the case where the dependent variable is truncated lognormal. But
there have been very few applications of MLE of the probability of an
event when the dependent variable is binary.

The three specifications used in this study are presented in Section
2: the linear probability model; the logistic; and that known as
Urban's curve. Section 3 presents a rationale for the sigmoid
specification by aggregating the individual's labor force
participation decision. Section 4 describes the data and variables
used in this study. Section 5 presents the empirical results and a
comparison between the specifications. Section 6 offers a brief
summary of results.

2.  The  Models 

Labor force participation is a qualitative characteristic. An
observation consists of noting whether the characteristic is present.
Thus, the dependent variable, designated as $Y$, is dichotomous and
takes a value of 1 if the family head had a job or was looking for
work and a value of 0 if not in the labor force. A natural way to
proceed is to estimate $\Pr(Y = 1; X) = \mathrm{LFPR}(X) = \theta$,
where LFPR denotes the labor force participation rate, $X$ a set of
stimuli, and $\theta$ a probability. The predicted value of the
dependent variable can be alternatively interpreted as the probability
of participation for an individual or as the labor force participation
rate for individuals with like characteristics.

The probability of labor force participation can be considered as the
parameter $\theta_i$ in a family of distributions

$$f(y_i; \theta_i) = \theta_i^{y_i} (1 - \theta_i)^{1 - y_i}; \quad y_i = 0, 1 \tag{1}$$

where the $\theta_i$ are assumed to be a function of the variables
$X_{i1}, \ldots, X_{ik}$. The estimation of $\theta_i$ can be obtained
from a series of $N$ observations $[y_i, X_{i1}, \ldots, X_{ik}]$,
$i = 1, \ldots, N$. For example,

$$\theta_i = \theta_i(X_{i1}, \ldots, X_{ik}) \tag{2}$$

is the probability that the ith individual will be participating in
the labor force when characterized by the variables
$X_{i1}, \ldots, X_{ik}$; that is, $\theta_i = \Pr(Y_i = 1; X_i)$ or,
alternatively, the expected proportion from a set of persons
confronted with like stimuli that will be participating in the labor
force.


The implication of this model is that repeated trials on individuals
with the same characteristics will produce some successes and some
failures in accordance with the Bernoulli parameter, $\theta_i$. This
may be contrasted with a discriminant model where two immutable
populations, successes and failures, exist and the problem is to
classify individuals into one or the other.

The empirical problem is to obtain estimates for the $\theta_i$ in
(2). A linear probability model specifies that

$$\theta_i = E(Y_i; X_i) = \Pr(Y_i = 1; X_i) = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ij}, \quad i = 1, \ldots, N \tag{3}$$

However, there is nothing inherent in unconstrained linear regression
estimation of $\theta_i$ that guarantees that the predicted values
will fall in the unit interval. The predicted value, $\hat\theta_i$,
can be reconciled with the probability interpretation by applying the
following rule, where $\theta_i^*$ is the predicted probability:

$$\theta_i^* = 1 \quad \text{if } \hat\theta_i > 1;$$
$$\theta_i^* = \hat\theta_i \quad \text{if } 0 \le \hat\theta_i \le 1;$$
$$\theta_i^* = 0 \quad \text{if } \hat\theta_i < 0.$$

This artificial rule circumvents the fact that the least squares
estimate extends outside the unit interval, but the estimates are no
longer unbiased for $\theta_i$.

A second weakness in applying linear regression to (3) is that the
error has a discrete distribution and has a diagonal covariance matrix
with elements $[\theta_i(1 - \theta_i)]$ along the diagonal. Because
of the changing variance of the error, the OLS coefficient estimator,
although unbiased, is not efficient. Under heteroscedasticity the
standard tests of significance do not apply. McGillivary [13] has
shown that $\hat\theta_i(1 - \hat\theta_i)$ is a consistent estimator
of the variance of the error, but $\hat\theta_i(1 - \hat\theta_i)$ may
be negative. An application of weighted least squares (WLS) is limited
to taking those predicted values from the OLS estimates that lie
inside the unit interval, causing a loss in the number of
observations.

In summary, empirical difficulties arise in treating the quantal
response model as a linear probability regression model. Since the
disturbances are non-normal and heteroscedastic, even the asymptotic
use of the standard estimators and test statistics is questionable.
The single advantage to the linear regression model is that the
computational procedure is relatively simple.

[Figure I]

Figure I illustrates the seriousness of misspecification. If the true
function is sigmoid, the dichotomous observations represented by the
small circles will result in an OLS fit as shown. Not only will the
estimated function extend outside the unit interval, but the estimated
relationships between $\theta$ and the stimuli are likely to be
seriously biased, and not only at the extremes.
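
As a minimal illustration of the difficulties just described, the following Python sketch fits a simple OLS line to a small, made-up set of dichotomous observations. The data and the function names `ols_line` and `truncate` are hypothetical and are not taken from the study. Under these assumptions it shows fitted values falling outside the unit interval, the variance estimate $\hat\theta_i(1 - \hat\theta_i)$ turning negative there, and the truncation rule applied after the fact.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least squares line through (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def truncate(theta_hat):
    """Truncation rule: force a fitted value into the unit interval."""
    return min(1.0, max(0.0, theta_hat))


# Hypothetical data: the response switches from 0 to 1 as the stimulus rises.
x = list(range(-5, 6))
y = [1 if xi > 0 else 0 for xi in x]

b0, b1 = ols_line(x, y)
fitted = [b0 + b1 * xi for xi in x]            # linear probability forecasts
variances = [f * (1.0 - f) for f in fitted]    # estimate of theta(1 - theta)
truncated = [truncate(f) for f in fitted]      # forecasts after the rule

print(min(fitted), max(fitted))   # the extremes lie outside the unit interval
print(min(variances))             # negative where the forecast is infeasible
```

Any WLS step that uses `variances` as weights would have to discard the observations with negative entries, which is the loss of observations noted above.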

The logistic model specifies that

$$\theta_i = 1 / \left[1 + \exp\left[-\left(\beta_0 + \sum_{j=1}^{k} \beta_j X_{ij}\right)\right]\right], \quad i = 1, \ldots, N \tag{4}$$

which has been proposed by Berkson [5] and Theil [16] within the
contexts of bio-assay and information theory respectively. The primary
advantage of (4) is that $\theta_i$ is bounded by the values of zero
and one.

To estimate $\beta$ in (4) by linear regression, the transformation

$$\ln[\theta_i / (1 - \theta_i)] = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ij}, \quad i = 1, \ldots, N \tag{5}$$

requires sample observations of $\theta_i$. When confronted with
single observations of $Y_i$ for each $X_i$, observations at different
values of $X_i$ must be combined into classes and the relative
frequencies $f_g$ for each class computed:

$$\ln[f_g / (1 - f_g)] = \beta_0 + \sum_{j=1}^{k} \beta_j X_{gj} + v_g, \quad g = 1, \ldots, G \tag{6}$$

with $G$ denoting the number of classes, and each $X_{gj}$ as the mean
of the observations $X_{ij}$ in the gth class. This LOGIT
specification exhibits heteroscedasticity due to unequal sized groups
[12, 16]. This suggests that (6) be estimated by WLS. The grouping
technique tends to drastically reduce the sample size, and the detail
contained in micro-data will be reduced. Aggregation error may also
become a problem. There is also a problem of appropriate grouping,
since $f_g$ cannot be allowed to be zero or one, but large size groups
reduce the effective sample size. Finally, this model does not provide
a least squares solution for the $\theta_i$, but rather for a quite
arbitrary non-linear transformation, $\ln[\theta_i / (1 - \theta_i)]$.

As an alternative to the logistic that is also mathematically
constrained to the unit interval, the Urban's curve model specifies
that

$$\theta_i = 0.5 + \left[\tan^{-1}\left(\beta_0 + \sum_{j=1}^{k} \beta_j X_{ij}\right)\right] / \pi, \tag{7}$$

which leads to the transformation,

$$\tan[(2\theta_i - 1)\pi/2] = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ij}, \quad i = 1, \ldots, N \tag{8}$$

Clearly, the same problems exist as with the logistic model, e.g.,
data must be combined into classes to use linear regression
techniques. Ashton [4] compared the Urban's and logistic
transformations, as well as the probit and sine transformations, and
found that the Urban's curve approached the limits of the unit
interval slowly compared to the other sigmoid transformations, which
were all similar over the whole range.
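
The behavior of the two sigmoid specifications in the tails can be checked numerically. The short Python sketch below evaluates the logistic (4) and Urban's curve (7) at a few values of the linear index, together with their linearizing transformations (5) and (8). The function names and the grid of index values are illustrative assumptions only.

```python
import math


def logistic(z):
    """Logistic probability for a linear index z, as in (4)."""
    return 1.0 / (1.0 + math.exp(-z))


def urban(z):
    """Urban's curve probability for a linear index z, as in (7)."""
    return 0.5 + math.atan(z) / math.pi


def logit(theta):
    """Transformation (5): recovers the index from a logistic probability."""
    return math.log(theta / (1.0 - theta))


def urban_transform(theta):
    """Transformation (8): recovers the index from an Urban's curve probability."""
    return math.tan((2.0 * theta - 1.0) * math.pi / 2.0)


# Both curves pass through 0.5 at z = 0, but Urban's curve has heavier tails.
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(z, round(logistic(z), 4), round(urban(z), 4))
```

For a large positive index the Urban's curve value stays further below one than the logistic value does, which is the slow approach to the limits that Ashton [4] reports.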

Maximum likelihood estimation seems to be an appropriate technique to
estimate the parameters in (4) or (8) above. The MLE has many
desirable large sample properties, including consistency, asymptotic
unbiasedness and efficiency, and invariance [12]. Given a random
sample, the likelihood function is given by

$$L(\beta_0, \beta_1, \ldots, \beta_k; y) = \prod_{i=1}^{N} f(y_i; \theta_i), \tag{9}$$

or

$$L = \prod_{i=1}^{N} \theta_i^{y_i} (1 - \theta_i)^{1 - y_i}. \tag{10}$$

Given Bernoulli observations for the logistic (4) or the alternative
Urban's curve (8), the log-likelihood is

$$\ln L = \sum_{i=1}^{N} [y_i \ln \theta_i + (1 - y_i) \ln(1 - \theta_i)]. \tag{11}$$

The problem is to determine the values of $\beta_0, \ldots, \beta_k$
to maximize $\ln L$. The set of $(k + 1)$ equations
$\partial \ln L / \partial \beta_j = 0$, $j = 0, \ldots, k$, are
transcendental and there is no closed form solution for $\beta$. This
leads to the use of an iterative ascent method such as the first-order
gradient method, Newton's second-order method using both the gradient
and Hessian, or the method of scoring, where the Hessian is replaced
by its expectation [12].

Experience with all three of these classes of ascent methods indicates
that the method of scoring is computationally the most acceptable from
the standpoint of successful convergence in reasonable time. The
method of scoring provides successive estimates $\beta^{(i)}$,
$\beta^{(i+1)}, \ldots$, according to the equations
$\beta^{(i+1)} = \beta^{(i)} - H^{-1}\Delta$, where $H$ is the
$(k + 1)$ by $(k + 1)$ matrix whose elements are
$E(\partial^2 \ln L / \partial \beta_j \partial \beta_m)$ and $\Delta$
is the gradient of $\ln L$ whose elements are
$\partial \ln L / \partial \beta_j$.
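
The scoring iteration can be sketched for the logistic specification with a single explanatory variable, where the $2 \times 2$ expected Hessian can be inverted by hand. This is only an illustrative sketch in Python on made-up data; it is not the large-scale program used for the results in this paper, and the names `gradient_and_information` and `fit_by_scoring` are hypothetical. For the logistic, $\partial \ln L / \partial \beta_j = \sum_i (y_i - \theta_i) X_{ij}$ and the expected Hessian has elements $-\sum_i \theta_i (1 - \theta_i) X_{ij} X_{im}$.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient_and_information(b0, b1, x, y):
    """Gradient of ln L and information matrix (minus the expected Hessian)."""
    g0 = g1 = a = b = c = 0.0
    for xi, yi in zip(x, y):
        theta = logistic(b0 + b1 * xi)
        w = theta * (1.0 - theta)      # Bernoulli variance of observation i
        g0 += yi - theta
        g1 += (yi - theta) * xi
        a += w
        b += w * xi
        c += w * xi * xi
    return (g0, g1), (a, b, c)         # information = [[a, b], [b, c]]


def fit_by_scoring(x, y, tol=1e-10, max_iter=50):
    """Iterate beta <- beta - H^{-1} * gradient, with H the expected Hessian."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (g0, g1), (a, b, c) = gradient_and_information(b0, b1, x, y)
        det = a * c - b * b
        # -H^{-1} equals the inverse information matrix, inverted explicitly.
        step0 = (c * g0 - b * g1) / det
        step1 = (a * g1 - b * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if max(abs(step0), abs(step1)) < tol:
            break
    # Large sample standard errors from the inverse information matrix.
    _, (a, b, c) = gradient_and_information(b0, b1, x, y)
    det = a * c - b * b
    return (b0, b1), (math.sqrt(c / det), math.sqrt(a / det))


# Hypothetical dichotomous observations (not separable, so the MLE is finite).
x = [-2, -1, 0, 1, 2, -2, -1, 0, 1, 2]
y = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
(b0, b1), (se0, se1) = fit_by_scoring(x, y)
print(b0, b1, se0, se1)
```

The standard errors come directly from the matrix already formed at each iteration, so the scoring method needs no separate computation to support large sample tests.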

Even with the best of methods, convergence to the MLE can be
problematic as the size of $\beta$ increases for a highly non-linear
model such as (4) or (8). The topic of iterative MLE in highly
non-linear models has been investigated by Brown [7]. He has developed
computer methods leading to successful convergence for large
dimensionality and highly non-linear models. These techniques were
used for the present specifications.

3.  The  Participation  Decision 

Assume that a potential labor market entrant wishes to maximize his
expected utility from the net present value between occupations. What
is needed first is to rank order the present dollar value of any
number of occupational alternatives relative to the alternative of not
participating in the labor force. Consider a decision by the ith
individual to participate in the labor force during the time interval
$(t, T)$. Let $\Omega_j^i$ be the discounted net monetary gain from
participation in the jth occupation for which that individual is
qualified.

The net present value of the jth occupation can be expressed as:

$$\Omega_j^i = \int_t^T r(t) \exp(-\rho t)\,dt - \int_t^T w(t) \exp(-\rho t)\,dt - C \exp(-\rho t) \tag{12}$$

where

$r(t) = p(t) r^{\circ} + [1 - p(t)] r^{*}$;
$p(t)$ = probability of employment in jth occupation;
$r^{\circ}$ = market wage rate for jth occupation;
$r^{*}$ = unemployment compensation rate, which may be zero;¹
$\rho$ = subjective discount rate for ith individual;
$w(t) = p(t) w^{*}(t)$, the welfare payment reduction associated with
the jth job if the ith person is employed; and
$C$ = fixed cost of entry into the jth occupation.

The value of welfare over the period for the ith person is

$$\Omega_{m+1}^i = \int_t^T W(t) \exp(-\rho t)\,dt \tag{13}$$

where $W(t)$ is the welfare payment.

While all of the above equations are expressed in terms of dollars,
the ith individual's utility may be substituted for dollars, assuming
a monotonic utility function, without loss of generality.

¹ The unemployment benefit is discounted over the whole period since
it represents a potential wage substitute even if never received.
Moreover, it is paid by the employer and would most likely be passed
on in the form of a higher money wage if it were not required.

If $\Omega^{*}$ is the largest value from the set
$\Omega_1^i, \ldots, \Omega_m^i$, then the participation decision is

$$Y_i = 1 \quad \text{if } \Omega^{*} > \Omega_{m+1}^i,$$
$$Y_i = 0 \quad \text{if } \Omega^{*} \le \Omega_{m+1}^i,$$

where $Y_i$ stands for the dichotomous decision by the ith person to
participate in the labor force. A decision maker compares the largest
net present value among attainable occupations to his welfare
alternative. If the largest return exceeds the welfare alternative,
the agent chooses to participate in the labor force. That is, an
individual will seek employment in the jth occupation if he expects a
net monetary gain that is larger than any alternative.
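
The decision rule can be made concrete with a small numerical sketch in Python. It assumes, purely for illustration, that the wage, benefit, and welfare rates are constant over the interval so that the integrals in (12) and (13) have closed forms; every number and function name below is hypothetical.

```python
import math


def discount_factor(rho, t, T):
    """Integral of exp(-rho * s) over the interval (t, T)."""
    return (math.exp(-rho * t) - math.exp(-rho * T)) / rho


def occupation_value(p, wage, ucb, welfare_cut, cost, rho, t, T):
    """Net present value (12) of one occupation with constant rates."""
    r = p * wage + (1.0 - p) * ucb     # expected earnings rate r(t)
    w = p * welfare_cut                # expected welfare reduction w(t)
    return (r - w) * discount_factor(rho, t, T) - cost * math.exp(-rho * t)


def welfare_value(payment, rho, t, T):
    """Present value (13) of the welfare alternative with a constant payment."""
    return payment * discount_factor(rho, t, T)


def participates(occupation_values, welfare):
    """Y = 1 if the best occupation beats the welfare alternative, else 0."""
    return 1 if max(occupation_values) > welfare else 0


# Hypothetical annual figures for one individual and one attainable occupation.
rho, t, T = 0.10, 0.0, 1.0
welfare = welfare_value(3000.0, rho, t, T)
high = occupation_value(0.9, 6000.0, 0.0, 2000.0, 200.0, rho, t, T)
low = occupation_value(0.9, 4000.0, 0.0, 2000.0, 200.0, rho, t, T)

print(participates([high], welfare), participates([low], welfare))
```

Holding everything else fixed, lowering the market wage in this sketch moves the individual from participation to non-participation, which is the reservation wage behavior discussed below.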

The value of the welfare alternative and the welfare loss due to
employment are such that they may be unique for each individual. The
value depends on the particular welfare program requirements. For
example, $W(t)$ may correspond to the AFDC full state standard and
$w(t)$ may then correspond to the rate of reduction in the AFDC full
state standard. For a description of the AFDC program see Solberg and
Langille [15]. The decision to participate is dependent also on
$r(t)$, which is determined by the level of the wage rate
$r^{\circ}$, the level of $r^{*}$, and the probability of employment
in the jth occupation.

Associated with the LFP decision is a critical value of the wage rate,
the reservation wage, above which $Y = 1$. In general there will exist
a minimum level of the wage rate, $r_{\min}$, below which no
individual will decide to participate. As the wage rate rises above
$r_{\min}$, a greater number of individuals will decide to
participate, where some differences in the critical wage exist since
tastes vary as well as circumstance. Let $N_j$ denote the total number
of individuals participating in the jth occupation group; then

$$N_j = \sum_i Y_i(r^{\circ}); \quad N_j \le N \tag{14}$$

Since $Y_i$ is dependent on the wage rate, so is $N_j$. The value $N$
represents the population of potential entrants. The $N_j$ function is
a step-function since $Y$ is dichotomous; however, with large numbers
of participants, this step-function can be approximated by a smooth
curve like that in Figure II. The $N(r^{\circ})$ curve in Figure II is
that of a sigmoid curve and is consistent with a unimodal
distribution; that is, most people's tastes are more alike than
different.

[Figure II. The Aggregate Participation Function]

The labor force participation rate (LFPR) traditionally used to study
LFP behavior can be computed directly from the aggregate participation
relation. To find the LFPR, simply divide the equilibrium $N_e$, which
is determined by the prevailing wage rate ($r_e$), by the total
available population; thus, $\mathrm{LFPR} = N_e / N$. The LFPR has
often been used by researchers in their study of LFP, since LFP cannot
be observed unless micro data are available. Note that since the
aggregate participation function is sigmoid shaped, the LFPR function
must be also.
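
The aggregation argument can be illustrated by simulation. The Python sketch below assumes, only for the sake of the example, that individual reservation wages follow a normal distribution (one possible unimodal distribution); the mean, spread, seed, and wage grid are all hypothetical. Each individual contributes a step function $Y_i(r^{\circ})$, and the sum divided by $N$ gives the LFPR at each wage.

```python
import random


def participation_count(reservation_wages, wage):
    """N(r): number of individuals whose reservation wage lies below the wage."""
    return sum(1 for r in reservation_wages if wage > r)


rng = random.Random(0)                       # fixed seed for repeatability
N = 10000                                    # population of potential entrants
reservation_wages = [rng.gauss(2.0, 0.5) for _ in range(N)]

wages = [0.5 + 0.25 * k for k in range(13)]  # wage grid from 0.5 to 3.5
lfpr = [participation_count(reservation_wages, w) / N for w in wages]

for w, rate in zip(wages, lfpr):
    print(w, rate)
```

The printed rates rise slowly at low wages, fastest near the center of the reservation wage distribution, and slowly again at high wages, which is the sigmoid shape described in the text.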

The labor force participation rate has the natural interpretation as a
point estimate of the probability of labor force participation. If
$\theta = \Pr(Y = 1)$ and $1 - \theta = \Pr(Y = 0)$, then the random
variable

$$\sum_{i=1}^{n} Y_i$$

is binomial over $n$ independent trials and
$\mathrm{LFPR} = \sum Y_i / n$ is an unbiased estimator of $\theta$.
Moreover, $\sum Y_i / n$ is asymptotically normally distributed, and
the normal distribution is unimodal and has a sigmoid shaped
distribution function.

There are some important empirical implications which are also
immediately obvious. First, the LFPR and LFP should always be
non-negatively related to the wage rate.² Second, estimation of the
LFPR calls for a sigmoid shaped functional form. Finally, the
liberalization of the disregard criteria in welfare programs like the
AFDC Program will cause an increase in the net present value of any
occupation for those agents categorically and financially eligible and
will therefore increase the LFPR. If the skill constraint restricts
occupational choice to the secondary labor market, there will be
little loss in empirical relevance if occupation is defined loosely,
allowing the comparison of the effects due to earned income, welfare
income, unemployment benefits income, other independent income, and
the probability of employment without regard to occupation.

4.  The  Data 

A subfile was created from the March 1970 Person Family file of the
Current Population Survey by selecting observations in which the
family head was female and dependent children were present, which may
be viewed as the AFDC population-at-risk.³ Only those family heads who
were in the civilian non-institutionalized population and whose
primary source of income was not gained from self-employment in
agriculture were included in the universe. Any observation which
corresponded to family heads over the age of 70 years was omitted in
order to limit the universe to those who could reasonably be expected
to participate in the labor force. This restriction also tended to
delete cases earning retirement or old age assistance. These
limitations resulted in a sample of 2,222 observations, 1,284 with
$Y = 1$ and 938 with $Y = 0$.

The dichotomous dependent variable was assigned a value of unity if
the head was working, with a job but not working, or looking for
employment. LFP was assigned a value of zero for those heads who were
at home, in school, unable to work, or had other reasons for not
participating.

² If pensions are included and the pensions depend on wage rates, then
higher wage rates may cause earlier retirement and reduce labor force
participation at a later date. Current LFP is not affected.

³ The AFDC unemployed parent category was not included in this study
since this constitutes a special and minor fraction of the total AFDC
population.

The independent variables include: expected earnings (EARNINGS), total
actual earned income of the head in hundreds of dollars multiplied by
one minus the unemployment rate, a proxy to measure the influence of
the probability of employment;⁴,⁵ welfare (WELFARE), the combined
total of income in hundreds of dollars received from all public
assistance programs, AFDC, Old Age Assistance, or Aid to the Blind and
Totally Disabled; expected unemployment benefits (UCB), the combined
total of income in hundreds received from unemployment compensation,
workman's compensation, government employee pensions, and veteran's
payments; other income (OTHER INCOME), residual family income in
hundreds derived by subtracting the prior income categories from total
family income; a dummy variable (SMSA) to identify whether the family
resided in a central city SMSA; a dummy variable (KIDS) which
indicates the presence of children five years old or less; a dummy
variable (RACE) indicating the head's race was Black; the actual age
of the family head (AGE); the highest grade of school attended by the
head (EDUCATION). In addition to the income variables, categorized as
finely as the CPS would allow, the other explanatory variables were
included in order to control for differences in tastes between
individuals or environmental influences, that is, to destratify the
sample.

⁴ While the unemployment rate does not in general measure the
probability of not finding a job for an individual, the inverse
variation between the probability of employment and the decision to
participate in the labor force is important to capture. Wickens [17]
has shown that ". . . it is better to use even a poor proxy than to
use none at all and omit the unobservable variable."

⁵ An attempt to create an instrument for earnings by regression using
the characteristics of the family head as explanatory variables was
abandoned because of the extremely low predictive ability of the
estimated relations; therefore, it is true that the earnings variable
used and labor force participation are subject to a tautological
relationship. This does not detract from the main point of the paper.

5.  The  Results 

The three specifications, linear, logistic, and urban, were used to
obtain forecasts of $\theta_i$, based upon all 2,222 sample
observations of $Y_i$ and the nine independent variables defined in
Section 4. Results are shown in Table 1.⁶ For all three
specifications, the probability $\theta_i$ is an increasing function
of the argument $\beta'X$. It is interesting to note that for all
three, the corresponding regression coefficients of each independent
variable in Table 1 have the same sign. This is reassuring since at
least the direction of influence implied by previous research is
likely to be correct.


TABLE 1

REGRESSION COEFFICIENTS

                  Linear Model    Logistic Model    Urban Model
Intercept             0.519           -0.131           -2.218
Earnings              0.010            0.125            0.319
Welfare              -0.010           -0.042           -0.043
UCB                  -0.087           -0.264           -0.078
Other Income         -0.001           -0.009           -0.012
SMSA                 -0.026           -0.416           -0.636
Kids                 -0.118           -0.733           -0.804
Race                  0.047            0.315            0.317
Age                  -0.004           -0.013           -0.010
Education             0.009            0.117            0.153

Model, $\theta =$
  Linear:    $\beta'X$
  Logistic:  $\exp[\beta'X] / (1 + \exp[\beta'X])$
  Urban:     $\frac{1}{2} + \frac{1}{\pi} \tan^{-1}[\beta'X]$

To test the forecasting ability of the specifications, the same
regressions were rerun with a randomly selected subset of 1,111 out of
the 2,222 observations. Then the resulting equations were used to
forecast $\theta_i$ for the remaining 1,111 observations. The 1,111
forecast $\theta_i$'s from each specification were classified into 20
cells, and the actual frequency count of $Y_i$'s in these cells was
obtained. The linear specification resulted in 141 infeasible
forecasts of $\theta_i$. These were replaced by 0.0 for
$\theta_i < 0$ and 1.0 for $\theta_i > 1$. Results are shown in Table
2. To compare the fit of the three specifications, a chi-squared
statistic was calculated for each: for the linear, $\chi^2 = 103.66$;
for the logistic, $\chi^2 = 29.78$; and for the urban,
$\chi^2 = 27.49$. Clearly, the latter two are superior. In fact, the
linear specification is rejected at the 5 percent level
($\chi^2_{20} = 31.41$) by a goodness-of-fit test.

⁶ The logistic was also examined by a stepwise likelihood estimation
technique akin to stepwise regression. The order of entry of the
independent variables was: WELFARE, EARNINGS, EDUCATION, OTHER INCOME,
and KIDS. The remaining independent variables were not significant by
a likelihood ratio test. The results are available on request.

TABLE 2

FORECASTS OF $\theta_i$ FOR LAST 1,111 OBSERVATIONS
BASED ON FIRST 1,111 OBSERVATIONS

                      Linear Model     Logistic Model     Urban Model
Forecast $\theta_i$   Y=0     Y=1       Y=0     Y=1       Y=0     Y=1
0.00 - 0.05            29*      0        21       1         1       0
0.05 - 0.10            18       4        60       7        83       8
0.10 - 0.15            22       3        91       6       155      14
0.15 - 0.20            30       4        59      11        87      28
0.20 - 0.25            42       7        71      19        50      14
0.25 - 0.30            56      13        45      21        25       7
0.30 - 0.35            68      12        39      16        15       9
0.35 - 0.40            77      28        26      17         8       6
0.40 - 0.45            48      27        15      12         4       7
0.45 - 0.50            42      39         8      18         3       7
0.50 - 0.55            17      38         6      14         1       2
0.55 - 0.60             9      36         4      16         0       6
0.60 - 0.65             2      45         8      14         2       3
0.65 - 0.70             6      38         2      13         1       7
0.70 - 0.75             4      37         1      16         3       7
0.75 - 0.80             0      39         4      17         3      15
0.80 - 0.85             0      43         1      32         6      22
0.85 - 0.90             2      39         4      38         8      18
0.90 - 0.95             3      27         3      54         9      96
0.95 - 1.00             1     156**       8     293        12     359
Totals                476     635       476     635       476     635

*16 of these forecasts were less than 0.0
**125 of these forecasts were greater than 1.0

To emphasize the danger in making inferences from the linear
probability model, the standard errors of the estimated coefficients
and the corresponding T-ratios are reported in Table 3. It should be
emphasized that the T-test is not valid for the linear model, unless
the empirical distribution were shown to be mound shaped and an appeal
were made to the robustness of the statistic. The test would indicate
UCB to be significantly different from zero at the 10 percent level of
significance in the linear model. But UCB fails in the logistic or
urban model. Further, SMSA is not significant in the linear model, but
it is significant in both the logistic and urban model. AGE is
significant in the linear model, but it is not significant at 5
percent in the logistic or urban model. The RACE variable was
statistically insignificant only in the urban model. Except for the
RACE variable, the logistic and urban model are in close agreement,
but they are contradictory to the linear model in several important
variables.
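
The chi-squared statistics quoted for the forecast comparison can be recomputed from the cell counts of Table 2. The Python sketch below rests on an assumption that the text does not state explicitly: that the expected number of participants in each cell is the cell total times the cell midpoint, with the usual Pearson form $\sum (O - np)^2 / [np(1 - p)]$. Under that assumption the computed values agree with the reported 103.66, 29.78, and 27.49 to within rounding. The names `TABLE_2` and `chi_squared` are illustrative.

```python
# (Y = 0 count, Y = 1 count) for each of the 20 forecast cells in Table 2,
# ordered from the cell 0.00 - 0.05 up to the cell 0.95 - 1.00.
TABLE_2 = {
    "linear": [(29, 0), (18, 4), (22, 3), (30, 4), (42, 7), (56, 13),
               (68, 12), (77, 28), (48, 27), (42, 39), (17, 38), (9, 36),
               (2, 45), (6, 38), (4, 37), (0, 39), (0, 43), (2, 39),
               (3, 27), (1, 156)],
    "logistic": [(21, 1), (60, 7), (91, 6), (59, 11), (71, 19), (45, 21),
                 (39, 16), (26, 17), (15, 12), (8, 18), (6, 14), (4, 16),
                 (8, 14), (2, 13), (1, 16), (4, 17), (1, 32), (4, 38),
                 (3, 54), (8, 293)],
    "urban": [(1, 0), (83, 8), (155, 14), (87, 28), (50, 14), (25, 7),
              (15, 9), (8, 6), (4, 7), (3, 7), (1, 2), (0, 6),
              (2, 3), (1, 7), (3, 7), (3, 15), (6, 22), (8, 18),
              (9, 96), (12, 359)],
}


def chi_squared(cells):
    """Pearson statistic with the cell midpoint as the expected probability."""
    width = 1.0 / len(cells)
    total = 0.0
    for k, (n0, n1) in enumerate(cells):
        n = n0 + n1                    # forecasts falling in this cell
        p = (k + 0.5) * width          # assumed expected probability (midpoint)
        total += (n1 - n * p) ** 2 / (n * p * (1.0 - p))
    return total


for model, cells in TABLE_2.items():
    print(model, round(chi_squared(cells), 2))
```

Only the linear statistic exceeds the 5 percent critical value of 31.41 for 20 degrees of freedom cited in the text.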

To facilitate comparison between the linear model and the highly
non-linear logistic and urban models, the derivative of each function
with respect to any explanatory variable was computed and evaluated at
the means of the explanatory variables. These results are reported in
Table 4. Except for the rates of change of the dummy variables SMSA,
KIDS, and RACE, the rates of change are remarkably similar for the
models with one important exception, the earnings and welfare
variables. While the coefficients of the EARNINGS and WELFARE
variables are of the same magnitude in the linear model, indicating
equal subjective valuation of earnings and welfare income, the
coefficients of EARNINGS are much greater in magnitude relative to the
coefficients of WELFARE in both the sigmoid specifications. Policy
implications from the sigmoid curves would be quite different from
those implied by the linear model.
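
The derivatives behind such a comparison follow from the three functional forms: $\partial\theta/\partial X_j = \beta_j$ for the linear model, $\beta_j \theta (1 - \theta)$ for the logistic, and $\beta_j / [\pi (1 + (\beta'X)^2)]$ for Urban's curve. The Python sketch below applies them to the EARNINGS and WELFARE coefficients of Table 1. The sample means are not reproduced here, so the index value `z` is a hypothetical placeholder and the printed rates of change are not the paper's Table 4 entries; the ratio of the two effects, however, does not depend on `z`.

```python
import math

# EARNINGS and WELFARE coefficients as reported in Table 1.
COEFFICIENTS = {
    "linear": {"EARNINGS": 0.010, "WELFARE": -0.010},
    "logistic": {"EARNINGS": 0.125, "WELFARE": -0.042},
    "urban": {"EARNINGS": 0.319, "WELFARE": -0.043},
}


def slope_factor(model, z):
    """Derivative of theta with respect to the index z = beta'X."""
    if model == "linear":
        return 1.0
    if model == "logistic":
        theta = 1.0 / (1.0 + math.exp(-z))
        return theta * (1.0 - theta)
    if model == "urban":
        return 1.0 / (math.pi * (1.0 + z * z))
    raise ValueError("unknown model: " + model)


def marginal_effect(model, variable, z):
    """Rate of change of theta with respect to one explanatory variable."""
    return COEFFICIENTS[model][variable] * slope_factor(model, z)


z = 0.5  # hypothetical value of beta'X; the sample means are not given here
for model in COEFFICIENTS:
    earn = marginal_effect(model, "EARNINGS", z)
    welf = marginal_effect(model, "WELFARE", z)
    print(model, earn, welf, abs(earn / welf))
```

The ratio is one for the linear model but roughly three for the logistic and over seven for Urban's curve, in line with the difference in relative magnitudes described above.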

STANDARD ERRORS
(T-RATIOS IN PARENTHESES)

Variable         Linear       Logistic     Urban

EARNINGS*        3.90 E-04    6.75 E-03    3.10 E-02
                 (25.64)      (18.52)      (10.27)

WELFARE*         8.90 E-04    7.36 E-03    1.38 E-02
                 (11.23)      (5.70)       (3.12)

UCB              5.03 E-02    3.45 E-01    4.73 E-01
                 (1.73)       (0.77)       (0.16)

OTHER INCOME*    3.60 E-04    2.74 E-03    4.84 E-03
                 (2.78)       (3.28)       (2.48)

SMSA             1.70 E-02    1.37 E-01    2.31 E-01
                 (1.53)       (3.03)       (2.75)

KIDS*            2.17 E-02    1.71 E-01    2.76 E-01
                 (5.44)       (4.27)       (2.91)

RACE             1.81 E-02    1.43 E-01    2.26 E-01
                 (2.60)       (2.20)       (1.40)

AGE              9.10 E-04    6.93 E-03    1.00 E-02
                 (4.40)       (1.88)       (1.00)

EDUCATION*       3.05 E-03    2.43 E-02    3.95 E-02
                 (2.95)       (4.81)       (3.87)

*Significant by likelihood ratio test in the logistic.
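
The significance comparisons discussed in the text can be checked
against the tabulated t-ratios with a short sketch. It assumes
two-sided tests with large-sample standard normal critical values
(1.645 at 10 percent, 1.960 at 5 percent); the use of normal rather
than exact t critical values is an assumption, since the sample size is
not restated in this section.

```python
# t-ratios transcribed from the table above, in the order
# (linear, logistic, urban).
T_RATIOS = {
    "EARNINGS": (25.64, 18.52, 10.27),
    "WELFARE": (11.23, 5.70, 3.12),
    "UCB": (1.73, 0.77, 0.16),
    "OTHER INCOME": (2.78, 3.28, 2.48),
    "SMSA": (1.53, 3.03, 2.75),
    "KIDS": (5.44, 4.27, 2.91),
    "RACE": (2.60, 2.20, 1.40),
    "AGE": (4.40, 1.88, 1.00),
    "EDUCATION": (2.95, 4.81, 3.87),
}

# Two-sided standard normal critical values (large-sample assumption).
CRIT_10 = 1.645
CRIT_5 = 1.960


def level(t):
    """Return the strictest level (5% or 10%) at which |t| is significant."""
    if abs(t) >= CRIT_5:
        return "5%"
    if abs(t) >= CRIT_10:
        return "10%"
    return "n.s."


summary = {name: tuple(level(t) for t in ts) for name, ts in T_RATIOS.items()}

print(f"{'Variable':<13}{'Linear':>9}{'Logistic':>9}{'Urban':>9}")
for name, row in summary.items():
    print(f"{name:<13}" + "".join(f"{cell:>9}" for cell in row))
```

Under these assumptions UCB clears only the 10 percent threshold and
only in the linear model, while SMSA is significant only in the two
sigmoid models.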

6.  Summary 

In summary, the functional form makes quite
a difference. An investigator should be wary of
generalizing from any single specification or
estimation technique. However, the above results
show in striking fashion the superiority of MLE
of the sigmoid specifications over OLS estimation
of the linear probability specification. Although
the logistic and urban specifications require
iterative solution, this is no barrier on a modern
digital computer with appropriate special
algorithms. A further advantage of MLE is the
asymptotic normality of the parameter estimates,
which permits large-sample interval estimation,
and the iterative method of scoring employed
yields directly an estimate of the standard
deviation of each normally distributed estimate.
Standard tests of significance are therefore also
applicable.
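
How the method of scoring yields standard errors as a by-product can be
sketched for the simplest case. The following is a minimal
illustration, not the authors' program: it assumes a logistic model
with an intercept and a single regressor, and the data are simulated
with hypothetical true parameters TRUE_A and TRUE_B.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit_scoring(xs, ys, tol=1e-10, max_iter=50):
    """Fisher scoring for P(y = 1) = sigmoid(a + b * x).

    Returns ((a, b), (se_a, se_b)). The standard errors are square roots
    of the diagonal of the inverse information matrix that the last
    scoring step used.
    """
    a = b = 0.0
    for _ in range(max_iter):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            s0 += y - p            # score with respect to a
            s1 += (y - p) * x      # score with respect to b
            i00 += w               # information matrix entries
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        v00, v01, v11 = i11 / det, -i01 / det, i00 / det  # inverse information
        da = v00 * s0 + v01 * s1
        db = v01 * s0 + v11 * s1
        a, b = a + da, b + db
        if max(abs(da), abs(db)) < tol:
            break
    return (a, b), (math.sqrt(v00), math.sqrt(v11))


# Simulated data under hypothetical true parameters.
TRUE_A, TRUE_B = -0.5, 1.0
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(2000)]
ys = [1.0 if random.random() < sigmoid(TRUE_A + TRUE_B * x) else 0.0 for x in xs]

estimates, std_errors = logit_scoring(xs, ys)
# Large-sample 95 percent intervals from asymptotic normality.
intervals = [(e - 1.96 * s, e + 1.96 * s) for e, s in zip(estimates, std_errors)]
print(estimates, std_errors, intervals)
```

The inverse information matrix is needed at every step to compute the
update, so the variance estimates require no additional computation
once the iteration has converged.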

Perhaps most importantly, the sigmoid
specifications are consistent with a probability
interpretation, since the estimates lie inside the
unit interval, and the sigmoid shape is consistent
with the assumed unimodal distribution of the
participation decision.
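
The unit-interval property can be illustrated with a small sketch. The
index values below are hypothetical, and the arctangent form used for
the urban curve is an assumption, as this section does not restate the
formula. In practice each model would have its own fitted index; the
same values are reused here only to show the range of each function.

```python
import math


def linear_p(z):
    """Linear probability function: the index is the forecast itself."""
    return z


def logistic_p(z):
    return 1.0 / (1.0 + math.exp(-z))


def arctan_p(z):
    """Assumed urban form: 1/2 + arctan(z) / pi."""
    return 0.5 + math.atan(z) / math.pi


def count_out_of_range(probs):
    """Return (number of forecasts below 0.0, number above 1.0)."""
    below = sum(1 for p in probs if p < 0.0)
    above = sum(1 for p in probs if p > 1.0)
    return below, above


# Hypothetical values of the fitted linear index.
index_values = [-3.0, -0.2, 0.1, 0.5, 0.9, 1.3, 4.0]

linear_counts = count_out_of_range([linear_p(z) for z in index_values])
logistic_counts = count_out_of_range([logistic_p(z) for z in index_values])
arctan_counts = count_out_of_range([arctan_p(z) for z in index_values])
print(linear_counts, logistic_counts, arctan_counts)
```

Only the linear function can produce forecasts outside the unit
interval; both sigmoid functions map every index value into it.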

In conclusion, results reported in previous
investigations of the probability of labor force
participation or the labor force participation rate
which have relied on least squares estimation
of a linear probability specification are likely to
be unreliable as to the magnitude of the response
attributed to changes in explanatory variables.

♦These rates of change should be interpreted with caution since these are dummy variables which might best be
thought of as location parameters. Here SMSA = 0.840, KIDS = 0.279, and RACE = 0.726.


sion  Analysis  when  the  Dependent  Variable  is 
Truncated  Lognormal,  with  an  Application  to 
the Determinants of the Duration of Welfare
Dependency,” International Economic Review,
15, No. 2, 1974, pp. 485-96.

[4] W. D. Ashton, “The Logit Transformation,”
Griffin's Statistical Monographs and Courses,
No. 32, Alan Stuart, Ed., Griffin, London, 1972.

[5] J. Berkson, “Application of the Logistic
Function to Bio-Assay,” Journal of the American
Statistical Association, 39, 1944, pp. 357-65.

[6] W. G. Bowen and T. A. Finegan, The Economics
of Labor Force Participation, Princeton
University Press, 1969.

[7] Gerald Brown, “Nonlinear Statistical Estimation
with Numerical Maximum Likelihood,” Western
Management Science Institute (UCLA) Technical
Report 222, 1974.

[8]  Glen  G.  Cain,  Married  Women  in  the  Labor 
Force,  University  of  Chicago  Press,  1966. 

[9] M. S. Cohen, S. A. Rea, Jr., and R. I. Lerman,
A Micro Model of Labor Supply, BLS Staff
Paper 4, U.S. Department of Labor, 1970.

[10] Morley Gunderson, “Retention of Trainees, A
Study with Dichotomous Dependent Variables,”
Journal of Econometrics, 2, 1974, pp. 79-93.

[11] L. J. Hausman, “The Impact of Welfare on the
Work Effort of AFDC Mothers,” The President's
Commission on Income Maintenance Programs,
Technical Studies, 1970, pp. 83-100.

[12] M. G. Kendall and A. Stuart, The Advanced
Theory of Statistics, 2, 3rd ed., Hafner Publishing
Company, 1973.

[13] R. G. McGillivray, “Estimating the Linear
Probability Function,” Econometrica, September
1970, pp. 775-76.

[14] Marc Nerlove and S. James Press, Univariate and
Multivariate Log-linear and Logistic Models,
Rand R-1306-EDA/NIH, December 1973.

[15] E. J. Solberg and F. Langille, “The Wage Rate,
Potential Work Incentives, and Benefit Payment
Reduction in the AFDC Program,” The Quarterly
Review of Economics and Business, Summer
1974, pp. 85-100.

[16] Henri Theil, Economics and Information
Theory, Rand McNally and Co., 1967.

[17] M. R. Wickens, “A Note on the Use of Proxy
Variables,” Econometrica, July 1972, pp. 759-61.


